Automatic Overheads Profiler for OpenMP Codes
Author
Abstract
Developing a good parallel implementation requires understanding where run time is spent and comparing it with a realistic best possible time. We introduce “overhead analysis” as a way of comparing achieved performance with achievable performance. We present a tool, OVALTINE, which aims to automatically provide the user with a hierarchical set of overheads for a given OpenMP implementation with respect to a given serial implementation. We give preliminary results from OVALTINE on an SGI Origin2000 and show how the tool can be used to improve performance.
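The abstract does not say how the overheads are computed. As a rough illustration only, the sketch below assumes that the "achievable" time is the serial time divided by the number of threads, and treats the gap between that ideal and the measured parallel time as the total overhead. This convention, the function names, and the timings are all assumptions made for the example; none of them come from OVALTINE itself.

```python
def total_overhead(serial_time, parallel_time, num_threads):
    """Gap between the measured parallel time and the ideal time.

    Assumption (not from the source): the ideal time is
    serial_time / num_threads, i.e. perfect linear speedup.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    ideal_time = serial_time / num_threads
    return parallel_time - ideal_time


def overhead_fraction(serial_time, parallel_time, num_threads):
    """Share of the measured parallel time that is overhead."""
    return total_overhead(serial_time, parallel_time, num_threads) / parallel_time


# Hypothetical timings in seconds, chosen only for illustration.
serial_time = 80.0
parallel_time = 14.0
num_threads = 8

overhead = total_overhead(serial_time, parallel_time, num_threads)
fraction = overhead_fraction(serial_time, parallel_time, num_threads)
print(f"total overhead: {overhead:.2f} s")
print(f"overhead share of parallel time: {fraction:.1%}")
```

A hierarchical analysis of the kind the abstract mentions would go on to split this total into named categories; the sketch stops at the top-level figure because the source does not list the categories.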
Similar resources
Composing Low-Overhead Scheduling Strategies for Improving Performance of Scientific Applications
Many different sources of overheads impact the efficiency of a scheduling strategy applied to a parallel loop within a scientific application. In prior work, we handled these overheads using multiple loop scheduling strategies, with each scheduling strategy focusing on mitigating a subset of the overheads. However, mitigating the impact of one source of overhead can lead to an increase in the i...
OpenMP benchmark using PARKBENCH
Real application codes in OpenMP measure the performance of OpenMP programming on real problems. Although this is ultimately what the end user wants, full real applications are often large and complex. To obtain a guide to the performance of OpenMP parallel programs on any given parallel system, kernel and synthetic benchmarks are useful. PARKBENCH[4] is a set of ben...
Performance Analysis of Shared-Memory Parallel Applications Using Performance Properties
Tuning parallel code can be a time-consuming and difficult task. We present our approach to automate the performance analysis of OpenMP applications that is based on the notion of performance properties. Properties are formally specified in the APART specification language (ASL) with respect to a specific data model. We describe a data model for summary (profiling) data of OpenMP applications a...
Experiences with OpenMP in tmLQCD
An overview is given of the lessons learned from the introduction of multi-threading using OpenMP in tmLQCD. In particular, programming style, performance measurements, cache misses, scaling, thread distribution for hybrid codes, race conditions, the overlapping of communication and computation and the measurement and reduction of certain overheads are discussed. Performance measurements and sa...
Reparallelization techniques for migrating OpenMP codes in computational grids
Typical computational grid users target only a single cluster and have to estimate the runtime of their jobs. Job schedulers prefer short-running jobs to maintain a high system utilization. If the user underestimates the runtime, premature termination causes computation loss; overestimation is penalized by long queue times. As a solution, we present an automatic reparallelization and migration ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2000